Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm

Authors

  • Jérôme Azé
  • Christophe Sola
  • Jian Zhang
  • Florian Lafosse-Marin
  • Memona Yasmin
  • Rubina Siddiqui
  • Kristin Kremer
  • Dick van Soolingen
  • Guislaine Refrégier
  • Joao Inacio
Abstract

Infra-species taxonomy is a prerequisite for comparing features such as virulence across pathogen lineages. Mycobacterium tuberculosis complex taxonomy has evolved rapidly over the last 20 years through intensive clinical isolation, advances in sequencing, and the description of fast-evolving loci (CRISPR and MIRU-VNTR). Online tools for describing new isolates have been set up based on known diversity, either in CRISPRs (whose profiles are known as spoligotypes) or in MIRU-VNTR profiles. The underlying taxonomies are largely concordant but use different names and offer different depths. The objectives of this study were 1) to make explicit the consensus that exists between the alternative taxonomies, and 2) to provide an online tool to ease classification of new isolates. Genotyping (24-VNTR, 43-spacer spoligotypes, IS6110-RFLP) was undertaken for 3,454 clinical isolates from the Netherlands (2004-2008). The resulting database was enlarged with African isolates to include most human tuberculosis diversity. Assignments were obtained using TB-Lineage, MIRU-VNTRPlus, SITVITWEB and an algorithm from Borile et al. By identifying the recurrent concordances between the alternative taxonomies, we proposed a consensus comprising 22 sublineages. Original and consensus assignments of all isolates from the database were subsequently fed into an ensemble learning approach based on the machine learning tool Weka to derive a classification scheme. All assignments were reproduced with very good sensitivities and specificities. When applied to independent datasets, the scheme was able to suggest new sublineages such as pseudo-Beijing. This Lineage Prediction tool, which works on 15-MIRU, 24-VNTR and spoligotype data, is available on the web interface "TBminer." Another section of this website helps summarize key molecular epidemiological data, easing tuberculosis surveillance.
Altogether, we successfully used machine learning on a large dataset to set up and make available the first consensus taxonomy for the human Mycobacterium tuberculosis complex. Additional developments using SNPs will help stabilize it.
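
The idea of reconciling assignments from several tools into one consensus label can be illustrated with a minimal sketch. It is a plain majority vote, not the Weka ensemble learning approach used in the study. The tool names (`tool_a`, `tool_b`, `tool_c`), the label mapping `TO_CONSENSUS`, and the function `consensus_label` are hypothetical placeholders introduced only for illustration.

```python
from collections import Counter

# Hypothetical mapping from (tool, tool-specific label) to a shared label.
# Illustrative placeholders only; this is NOT the mapping derived in the study.
TO_CONSENSUS = {
    ("tool_a", "East-Asian"): "Beijing",
    ("tool_b", "Beijing"): "Beijing",
    ("tool_c", "Beijing"): "Beijing",
    ("tool_a", "Euro-American"): "Euro-American",
    ("tool_b", "Haarlem"): "Haarlem",
    ("tool_c", "H1"): "Haarlem",
}


def consensus_label(assignments, min_votes=2):
    """Return the majority label across tools, or None if there is no clear winner.

    `assignments` maps a tool name to the label that tool gave one isolate.
    Labels missing from TO_CONSENSUS are ignored.
    """
    votes = Counter()
    for tool, label in assignments.items():
        mapped = TO_CONSENSUS.get((tool, label))
        if mapped is not None:
            votes[mapped] += 1
    if not votes:
        return None
    best, count = votes.most_common(1)[0]
    # Require enough agreeing tools and reject ties between labels.
    tied = [name for name, c in votes.items() if c == count]
    if count < min_votes or len(tied) > 1:
        return None
    return best


if __name__ == "__main__":
    isolate = {"tool_a": "East-Asian", "tool_b": "Beijing", "tool_c": "Beijing"}
    print(consensus_label(isolate))
```

Returning `None` for ties or unmapped labels keeps discordant isolates visible for manual review instead of forcing them into a sublineage.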

Similar resources

Molecular Identification of Mycobacterium Tuberculosis Complex in Formalin-Fixed, Paraffin-Embedded Tissue Blocks of Extra Pulmonary Specimens using Genomics Extraction

Background: Tuberculosis has been detected in some extra pulmonary ecological niches. Although extra pulmonary tuberculosis (EPTB) is less frequent than Pulmonary Tuberculosis (PTB), its incidence has increased worldwide. The aim of this study was to investigate the presence of EPTB and MDR-EXPT in Formalin-fixed, paraffin-embedded tissue blocks among different samples in Kerma...

Biochemical characterization of PE_PGRS61 family protein of Mycobacterium tuberculosis H37Rv reveals the binding ability to fibronectin

Objective(s): The periodic binding of protein expressed by Mycobacterium tuberculosis H37Rv with the host cell receptor molecules i.e. fibronectin (Fn) is gaining significance because of its adhesive properties.  The genome sequencing of M. tuberculosis H37Rv revealed that the proline-glutamic (PE) proteins contain polymorphic GC-rich repetitive sequences (PGRS) which have clinical importance i...

Rapid detection of atypical mycobacteria in patients with symptoms of pulmonary tuberculosis: evaluation of the QUB 3232 locus (590 bp) by the VNTR method

Background and Objective: Identification of atypical mycobacteria (non-tuberculous mycobacteria, NTM) is important because of the worldwide spread of these organisms. Recently, molecular studies have identified specific loci for mycobacterium species by DNA fingerprinting methods, but these methods are time-consuming and expensive. In this study, in addition to hsp65 PCR-RFLP meth...

Identification of Mycobacterium Tuberculosis Complex, Using Molecular Methods

Abstract Background and Objective: The high level of homogeneity observed among all bacteria in the Mycobacterium tuberculosis complex seriously challenges traditional biochemical identification of these pathogens in the laboratory. The work presented here was conducted to characterize Mycobacterium tuberculosis complex isolates in Golestan, Northern Iran. ...

Diagnostic value of the gyrB-RFLP PCR test for species identification of pathogenic mycobacteria in tuberculosis patients in Mazandaran Province

Background and purpose: Mycobacterium tuberculosis complex (MTBC) members are causative agents of human and animal tuberculosis. Differentiation of MTBC members is essential for appropriate treatment of individual patients and for reducing drug resistance. Materials and methods: A total of 1345 samples were collected from patients clinically suspected of contracting tuberculosis who were referred to hea...

Journal title:

Volume 10, Issue

Pages -

Publication date: 2015